MERS exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

RNA-Skim: a rapid method for RNA-Seq quantification at transcript level

Internal identifier: 001B11 (Main/Exploration); previous: 001B10; next: 001B12

Authors: Zhaojun Zhang [United States]; Wei Wang [United States]

RBID : PMC:4058932

Abstract

Motivation: RNA-Seq technique has been demonstrated as a revolutionary means for exploring transcriptome because it provides deep coverage and base pair-level resolution. RNA-Seq quantification is proven to be an efficient alternative to Microarray technique in gene expression study, and it is a critical component in RNA-Seq differential expression analysis. Most existing RNA-Seq quantification tools require the alignments of fragments to either a genome or a transcriptome, entailing a time-consuming and intricate alignment step. To improve the performance of RNA-Seq quantification, an alignment-free method, Sailfish, has been recently proposed to quantify transcript abundances using all k-mers in the transcriptome, demonstrating the feasibility of designing an efficient alignment-free method for transcriptome quantification. Even though Sailfish is substantially faster than alternative alignment-dependent methods such as Cufflinks, using all k-mers in the transcriptome quantification impedes the scalability of the method.

Results: We propose a novel RNA-Seq quantification method, RNA-Skim, which partitions the transcriptome into disjoint transcript clusters based on sequence similarity, and introduces the notion of sig-mers, which are a special type of k-mers uniquely associated with each cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable with any state-of-the-art method. This enables RNA-Skim to perform transcript quantification on each cluster independently, reducing a complex optimization problem into smaller optimization tasks that can be run in parallel. As a result, RNA-Skim uses <4% of the k-mers and <10% of the CPU time required by Sailfish. It is able to finish transcriptome quantification in <10 min per sample by using just a single thread on a commodity computer, which represents a >100-fold speedup over the state-of-the-art alignment-based methods, while delivering comparable or higher accuracy.
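
The notion of sig-mers described above can be illustrated with a small sketch. This is not the RNA-Skim implementation: it assumes the transcript clusters are already given, ignores reverse complements, and keeps every cluster-unique k-mer rather than selecting a subset. The function names `kmers` and `find_sig_mers` and the toy sequences are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def find_sig_mers(clusters, k):
    """Map each cluster name to the k-mers that occur in that cluster only.

    clusters: dict of cluster name -> list of transcript sequences.
    (Assumption: clustering by sequence similarity was done beforehand.)
    """
    owners = defaultdict(set)  # k-mer -> names of the clusters containing it
    for name, transcripts in clusters.items():
        for seq in transcripts:
            for kmer in kmers(seq, k):
                owners[kmer].add(name)

    sig = {name: set() for name in clusters}
    for kmer, names in owners.items():
        if len(names) == 1:  # seen in exactly one cluster
            sig[next(iter(names))].add(kmer)
    return sig


# Toy transcriptome with two clusters (made-up sequences).
clusters = {
    "c1": ["ACGTAC", "CGTACG"],
    "c2": ["TTGACG", "GACGTT"],
}
sig_mers = find_sig_mers(clusters, k=4)
for name in sorted(sig_mers):
    print(name, sorted(sig_mers[name]))
```

In this toy input the 4-mer `ACGT` occurs in both clusters, so it is excluded from both sig-mer sets. Because each remaining k-mer belongs to exactly one cluster, counts of those k-mers in the reads can be attributed to a cluster without ambiguity, which is what allows each cluster to be quantified independently.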

Availability and implementation: The software is available at http://www.csbio.unc.edu/rs.

Contact: weiwang@cs.ucla.edu

Supplementary information: Supplementary data are available at Bioinformatics online.


DOI: 10.1093/bioinformatics/btu288
PubMed: 24931995
PubMed Central: 4058932


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">RNA-Skim: a rapid method for RNA-Seq quantification at transcript level</title>
<author>
<name sortKey="Zhang, Zhaojun" sort="Zhang, Zhaojun" uniqKey="Zhang Z" first="Zhaojun" last="Zhang">Zhaojun Zhang</name>
<affiliation wicri:level="4">
<nlm:aff wicri:cut=" and" id="btu288-AFF1">Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
<settlement type="city">Chapel Hill (Caroline du Nord)</settlement>
</placeName>
<orgName type="university">Université de Caroline du Nord à Chapel Hill</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wang, Wei" sort="Wang, Wei" uniqKey="Wang W" first="Wei" last="Wang">Wei Wang</name>
<affiliation wicri:level="2">
<nlm:aff id="btu288-AFF1">Department of Computer Science, University of California, Los Angeles, CA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of California, Los Angeles, CA</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24931995</idno>
<idno type="pmc">4058932</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058932</idno>
<idno type="RBID">PMC:4058932</idno>
<idno type="doi">10.1093/bioinformatics/btu288</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000B13</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">000B13</idno>
<idno type="wicri:Area/Pmc/Curation">000B13</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">000B13</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000F89</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000F89</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="RBID">pubmed:24931995</idno>
<idno type="wicri:Area/PubMed/Corpus">001936</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001936</idno>
<idno type="wicri:Area/PubMed/Curation">001936</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001936</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001768</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Checkpoint">001768</idno>
<idno type="wicri:Area/Ncbi/Merge">000E11</idno>
<idno type="wicri:Area/Ncbi/Curation">000E11</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000E11</idno>
<idno type="wicri:doubleKey">1367-4803:2014:Zhang Z:rna:skim:a</idno>
<idno type="wicri:Area/Main/Merge">001B19</idno>
<idno type="wicri:Area/Main/Curation">001B11</idno>
<idno type="wicri:Area/Main/Exploration">001B11</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">RNA-Skim: a rapid method for RNA-Seq quantification at transcript level</title>
<author>
<name sortKey="Zhang, Zhaojun" sort="Zhang, Zhaojun" uniqKey="Zhang Z" first="Zhaojun" last="Zhang">Zhaojun Zhang</name>
<affiliation wicri:level="4">
<nlm:aff wicri:cut=" and" id="btu288-AFF1">Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC</wicri:regionArea>
<placeName>
<region type="state">Caroline du Nord</region>
<settlement type="city">Chapel Hill (Caroline du Nord)</settlement>
</placeName>
<orgName type="university">Université de Caroline du Nord à Chapel Hill</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wang, Wei" sort="Wang, Wei" uniqKey="Wang W" first="Wei" last="Wang">Wei Wang</name>
<affiliation wicri:level="2">
<nlm:aff id="btu288-AFF1">Department of Computer Science, University of California, Los Angeles, CA, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer Science, University of California, Los Angeles, CA</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Bioinformatics</title>
<idno type="ISSN">1367-4803</idno>
<idno type="eISSN">1367-4811</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Female</term>
<term>Gene Expression Profiling (methods)</term>
<term>Genome</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Male</term>
<term>Mice</term>
<term>Mice, Inbred C57BL</term>
<term>Sequence Analysis, RNA (methods)</term>
<term>Software</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes (méthodes)</term>
<term>Analyse de séquence d'ARN (méthodes)</term>
<term>Animaux</term>
<term>Femelle</term>
<term>Génome</term>
<term>Logiciel</term>
<term>Mâle</term>
<term>Souris</term>
<term>Souris de lignée C57BL</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en">
<term>Gene Expression Profiling</term>
<term>Sequence Analysis, RNA</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Algorithms</term>
<term>Animals</term>
<term>Female</term>
<term>Genome</term>
<term>High-Throughput Nucleotide Sequencing</term>
<term>Male</term>
<term>Mice</term>
<term>Mice, Inbred C57BL</term>
<term>Software</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Algorithmes</term>
<term>Analyse de profil d'expression de gènes</term>
<term>Analyse de séquence d'ARN</term>
<term>Animaux</term>
<term>Femelle</term>
<term>Génome</term>
<term>Logiciel</term>
<term>Mâle</term>
<term>Souris</term>
<term>Souris de lignée C57BL</term>
<term>Séquençage nucléotidique à haut débit</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>
<bold>Motivation:</bold>
RNA-Seq technique has been demonstrated as a revolutionary means for exploring transcriptome because it provides deep coverage and base pair-level resolution. RNA-Seq quantification is proven to be an efficient alternative to Microarray technique in gene expression study, and it is a critical component in RNA-Seq differential expression analysis. Most existing RNA-Seq quantification tools require the alignments of fragments to either a genome or a transcriptome, entailing a time-consuming and intricate alignment step. To improve the performance of RNA-Seq quantification, an alignment-free method, Sailfish, has been recently proposed to quantify transcript abundances using all k-mers in the transcriptome, demonstrating the feasibility of designing an efficient alignment-free method for transcriptome quantification. Even though Sailfish is substantially faster than alternative alignment-dependent methods such as Cufflinks, using all k-mers in the transcriptome quantification impedes the scalability of the method.</p>
<p>
<bold>Results:</bold>
We propose a novel RNA-Seq quantification method, RNA-Skim, which partitions the transcriptome into disjoint transcript clusters based on sequence similarity, and introduces the notion of sig-mers, which are a special type of k-mers uniquely associated with each cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable with any state-of-the-art method. This enables RNA-Skim to perform transcript quantification on each cluster independently, reducing a complex optimization problem into smaller optimization tasks that can be run in parallel. As a result, RNA-Skim uses &lt;4% of the k-mers and &lt;10% of the CPU time required by Sailfish. It is able to finish transcriptome quantification in &lt;10 min per sample by using just a single thread on a commodity computer, which represents a >100-fold speedup over the state-of-the-art alignment-based methods, while delivering comparable or higher accuracy.</p>
<p>
<bold>Availability and implementation:</bold>
The software is available at
<ext-link ext-link-type="uri" xlink:href="http://www.csbio.unc.edu/rs">http://www.csbio.unc.edu/rs</ext-link>
.</p>
<p>
<bold>Contact:</bold>
<email>weiwang@cs.ucla.edu</email>
</p>
<p>
<bold>Supplementary information:</bold>
<ext-link ext-link-type="uri" xlink:href="http://bioinformatics.oxfordjournals.org/lookup/suppl/doi:10.1093/bioinformatics/btu288/-/DC1">Supplementary data</ext-link>
are available at
<italic>Bioinformatics</italic>
online.</p>
</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Californie</li>
<li>Caroline du Nord</li>
</region>
<settlement>
<li>Chapel Hill (Caroline du Nord)</li>
</settlement>
<orgName>
<li>Université de Caroline du Nord à Chapel Hill</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Caroline du Nord">
<name sortKey="Zhang, Zhaojun" sort="Zhang, Zhaojun" uniqKey="Zhang Z" first="Zhaojun" last="Zhang">Zhaojun Zhang</name>
</region>
<name sortKey="Wang, Wei" sort="Wang, Wei" uniqKey="Wang W" first="Wei" last="Wang">Wei Wang</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001B11 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001B11 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    MersV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:4058932
   |texte=   RNA-Skim: a rapid method for RNA-Seq quantification at transcript level
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:24931995" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a MersV1 

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Apr 20 23:26:43 2020. Site generation: Sat Mar 27 09:06:09 2021